FIGURE 3.13
In our proposed progressive optimization framework, the two additional losses, the projection loss and the center loss, are optimized simultaneously in continuous and discrete spaces and are optimally combined by the projection approach within a theoretical framework. The left subfigure illustrates the softmax function in the cross-entropy loss. The middle subfigure illustrates how the projection approach progressively turns ternary kernel weights into binary ones. The right subfigure shows how the center loss forces the learned feature maps to cluster together, class by class.
To alleviate the disturbance caused by the quantization process, we further enforce intra-class compactness based on the center loss [245] to improve performance. Given the input features $x_i \in \mathbb{R}^d$ (or $\Omega$) and the $y_i$th class center $c_{y_i} \in \mathbb{R}^d$ (or $\Omega$) of the input features, we have
$$L_C = \frac{\gamma}{2} \sum_{i=1}^{m} \| x_i - c_{y_i} \|_2^2, \quad (3.60)$$
where $m$ denotes the number of samples in the mini-batch (the batch size), and $\gamma$ is a hyperparameter that balances the center loss against the other losses. More details on the center loss can be found in [245].
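As a concrete illustration, Eq. 3.60 can be sketched in PyTorch roughly as follows; the class name `CenterLoss`, its arguments, and the random initialization of the class centers are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CenterLoss(nn.Module):
    """Minimal sketch of the center loss of Eq. 3.60 (illustrative, not the authors' code)."""

    def __init__(self, num_classes, feat_dim, gamma=0.1):
        super().__init__()
        self.gamma = gamma                                   # balancing hyperparameter in Eq. 3.60
        # one learnable center c_y per class, initialized randomly (an assumption)
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, features, labels):
        # features: (m, d) mini-batch of features x_i;  labels: (m,) class indices y_i
        centers_batch = self.centers[labels]                 # gather c_{y_i} for each sample
        # L_C = (gamma / 2) * sum_i ||x_i - c_{y_i}||_2^2
        return 0.5 * self.gamma * (features - centers_batch).pow(2).sum()
```

For example, `center_loss = CenterLoss(num_classes=10, feat_dim=512)` followed by `center_loss(features, labels)` returns the intra-class compactness penalty for one mini-batch.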
By incorporating Eq. 3.60 into Eq. 3.110, the total loss is updated as
$$L = L_S + L_P + L_C. \quad (3.61)$$
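A hedged sketch of how Eq. 3.61 could be assembled in a training step is given below; `model`, `projection_loss`, and `center_loss` are hypothetical stand-ins for the corresponding components (the actual procedure is given in Algorithm 3).

```python
import torch.nn as nn

cross_entropy = nn.CrossEntropyLoss()             # L_S: softmax cross-entropy loss

def total_loss(model, projection_loss, center_loss, images, labels):
    # `model` is assumed to return both the penultimate features and the logits
    features, logits = model(images)
    L_S = cross_entropy(logits, labels)           # cross-entropy term
    L_P = projection_loss(model)                  # projection loss over the kernel weights
    L_C = center_loss(features, labels)           # intra-class compactness, Eq. 3.60
    return L_S + L_P + L_C                        # total loss of Eq. 3.61
```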
We note that the center loss is deployed only during training to handle feature variations and is omitted at inference, so it introduces no additional memory or computational cost. More intuitive illustrations can be found in Fig. 3.13, and the detailed training procedure is described in Algorithm 3.